1 Motivation

Truth-based entailment is not sufficient for a good comprehension of natural
language (NL): it cannot deduce the implicit information necessary to
understand a text. Norm-based entailment, on the other hand, is able to reach
this goal. Let us consider this text [2]: "the vehicle in front of me braked".
Using a truth-based approach, we can obtain all the logical consequences of a
formula such as:
(∃v, t) Vehicle(v) ∧ Instant(t) ∧ In-Front-Of(v, 'me', t) ∧ Brake(v, t).
Norms, in contrast, provide further conclusions such as: v and I were moving in
the same direction, no vehicle was between v and me, I had to brake when v
braked...
This idea was behind the development of Frames [3] and Scripts [S] [5] in the
1970s. But these theories are not formalized enough, and their adaptation to new
situations is far from obvious.

Currently, no repository of norms is available for a given domain. Moreover,
norms are seldom made explicit in texts because, as Schank noticed, texts do
not describe the normal course of events but focus instead on the description
of abnormal situations. The motivation of the present work is to extract norms
by detecting their violations in texts.

We are working on a corpus of 60 texts describing car crashes. For each text,
we look for the cause of the accident as perceived by a standard reader.
We hypothesize that the perceived cause of an abnormal event is the violation
of a norm (an anomaly). Among all the anomalies evoked by a text, one is
considered 'primary': it represents the most plausible cause of the accident.
The other anomalies result from the primary one and are called derived
anomalies.



2 The representation language 

In this work, we use a first-order logic (FOL) representation language which
takes into account modalities, time and non-monotonicity. We give here only its
main principles (see [2] for details):

In order to quantify over properties, we use the technique of reification
commonly used in AI: a binary predicate P(X, Y) is written Holds(P, X, Y)
(footnote 1). We decompose the scene into a succession of discrete states. Each
one is characterized by a set of literals keeping a stable truth value. Thus,
what we represent explicitly is the linear time of the events, as they really
occurred. The form of the predicates is Holds(P, A, t), where P is a property,
A is an agent and t is a state number.

In addition to truth values, the texts introduce modalities. The technique of
reification enables us to treat modalities as first-order predicates. In our
work, we use two main modalities. The first one is a kind of necessity and
expresses the duties of agents: Must(P, A, t) means that at state t, agent A
must reach the property P. The second one is a kind of possibility and
expresses the capacities of agents: Able-To(P, A, t) means that at state t,
agent A is able to reach the property P.
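
As a minimal illustration of reification, properties and modalities can be
encoded as plain data, so that a program may quantify over properties just as
the logic does. This is only a sketch: the Python names Fact, combine and
properties_of are assumptions of this example and do not come from the system
described here.

```python
from typing import NamedTuple, Tuple, Union

# A property is either an atomic name ("Stop") or a complex property built
# with Combine, e.g. Combine(Bump, B) is encoded as ("Combine", "Bump", "B").
Property = Union[str, Tuple[str, str, str]]


class Fact(NamedTuple):
    modality: str   # "Holds", "Must" or "Able-To"
    prop: Property  # reified property P
    agent: str      # agent A
    state: int      # state number t


def combine(name: str, arg: str) -> Property:
    """Build the complex property Combine(name, arg)."""
    return ("Combine", name, arg)


# Illustrative facts only (hypothetical knowledge base).
facts = {
    Fact("Holds", "Stop", "B", 1),
    Fact("Holds", combine("Bump", "B"), "A", 2),
    Fact("Must", "Stop", "A", 1),
}


def properties_of(agent: str, state: int, modality: str = "Holds"):
    """Return every property P such that modality(P, agent, state) is recorded."""
    return {
        f.prop
        for f in facts
        if (f.modality, f.agent, f.state) == (modality, agent, state)
    }
```

Because P is an ordinary value, a call such as properties_of("A", 1, "Must")
ranges over properties, which a plain predicate Stop(A, 1) would not allow.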

Inference rules are written in Reiter's default logic [4]. Material implications
are written A → B; normal defaults (A : B / B) are written A : B for short, and
semi-normal ones (A : B ∧ C / B) are written A : B[C].
A primary anomaly takes one of the two following forms:

Must(P, A, t) ∧ Able-To(P, A, t) ∧ Holds(P', A, t + 1) ∧
Incompatible(P, P') → An
Holds(Combine(Disruptive-Factor, C), A, t) → An

Derived anomalies are expressed by:

Must(P, A, t) ∧ ¬Able-To(P, A, t) ∧ Holds(P', A, t + 1) ∧
Incompatible(P, P') → D-An
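
The anomaly forms can be sketched as a small classifier over sets of ground
facts. This is a hedged illustration, not the reasoning system itself: the
function name find_anomalies and the encoding of facts as (P, A, t) triples are
assumptions, and ¬Able-To is approximated by absence from the able_to set
(a closed-world simplification of the default reasoning).

```python
def find_anomalies(holds, must, able_to, incompatible, disruptive=frozenset()):
    """Classify violated duties as primary ("An") or derived ("D-An").

    holds, must, able_to: sets of (P, A, t) triples.
    incompatible: set of (P, P') pairs.
    disruptive: set of (C, A, t) triples standing for
                Holds(Combine(Disruptive-Factor, C), A, t).
    """
    anomalies = set()
    for (p, a, t) in must:
        # Is there a property P' holding for A at t + 1 that is incompatible with P?
        violated = any(
            (p, q) in incompatible
            for (q, b, u) in holds
            if b == a and u == t + 1
        )
        if not violated:
            continue
        # Able to fulfil the duty -> primary anomaly; otherwise derived.
        kind = "An" if (p, a, t) in able_to else "D-An"
        anomalies.add((kind, p, a, t))
    # Second primary form: a disruptive factor is an anomaly by itself.
    for (c, a, t) in disruptive:
        anomalies.add(("An", ("Disruptive-Factor", c), a, t))
    return anomalies


# Hypothetical facts, chosen only to exercise the rules.
must = {("Stop", "A", 1)}
holds = {("Bump", "A", 2)}
incompatible = {("Stop", "Bump")}
primary = find_anomalies(holds, must, {("Stop", "A", 1)}, incompatible)
derived = find_anomalies(holds, must, set(), incompatible)
```

With these inputs, the same violated duty is classified as primary when the
agent was able to stop and as derived when it was not.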

3 From the text to the cause of the accident 
Figure 1: The overall architecture of the reasoning system



3.1 From the text to the semantic predicates 

First of all, a tagger and a syntactico-semantic analysis are applied to the text. 
The result of this step is a set of linguistic relations between relevant words of 
the text. The list of this type of relations, called linguistic predicates, is very 
short, namely: 

Subject(V, N), Object(V, N): N is the subject (resp. object) of the verb V.
Qualif-N(N, A), Qualif-V(V, A): A is a qualification of the noun N
(resp. the verb V). This is the case, for example, for adjectives and adverbs.
Compl-N(X, N, Z), Compl-V(X, V, Z): Z is a complement of the noun
N (resp. the verb V). It is introduced by X.
Support(X, Y): X is a support for Y. For instance, in "A est venu heurter B" (A
came to run into B), we have the relation Support(venir, heurter).

1 A predicate Q(X, Y, t) is written: Holds(Combine(Q, X), Y, t). Combine(Q, X) is a
complex property.

Let us consider the text:

"J'étais à l'arrêt au feu rouge lorsque le véhicule A m'a percuté à l'arrière"
(I was stopped at a red light when vehicle A hit me from behind). We
obtain:

Subject(etre, J'), Compl-V(a, etre, arret), Compl-N(a, arret, feu),
Qualif-N(feu, rouge), Compl-V(lorsque, etre, percuter),
Subject(percuter, vehicule), Qualif-N(vehicule, A), Object(percuter, m'),
Compl-V(a, percuter, arriere).

After that, a non-monotonic linguistic reasoning process transforms the
linguistic predicates into a set of semantic predicates. For our example, we
obtain:

Holds(Stop, B, 1), Holds(Combine(Light, Red), A, 1),
Holds(Combine(Bump, B), A, 2), Holds(Combine(Shock, back), B, 2).

3.2 From the semantic predicates to the kernel 

Semantic predicates are supposed to represent the explicit semantic content of 
the text. The semantic reasoning process uses inference rules to enrich the set of 
semantic predicates extracted from a text by adding further implicit conclusions. 
The inference rules are based on our common knowledge about the norms of 
the domain of car crashes. 
The kernel contains six (reified) predicates:

Stop, Control, RunSlowly-Enough, Start, Move-Back, Combine(Disruptive-Factor, C).

Computing extensions of a first-order semi-normal default theory is intractable
in the general case. To overcome this difficulty, one has to consider subsets of
the theory in which some constraints are satisfied. In the present work,
predicates and rules are designed so that they can be stratified, i.e. organized
in layers such that the derivation of a predicate belonging to a given layer
depends only on the upper layers. Formally, the stratification constraint is
satisfied if the following conditions hold (L(P) denotes the number of the
layer containing the predicate P):
Each implication A → B (resp. normal default A : B) satisfies L(A) ≤ L(B).
Each semi-normal default A : B[C] satisfies L(A) ≤ L(B) and L(C) ≤ Max(L(A), L(B)).
Notice that the rules belonging to layer number L(B) are those having B as
conclusion.
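
The stratification test can be sketched as a simple check over a rule base.
The sketch assumes that the constraints read L(A) ≤ L(B) and
L(C) ≤ Max(L(A), L(B)), i.e. that a conclusion never sits in an earlier layer
than its premises, and that the layer of a conjunction of premises is the
largest layer among them. The function name check_stratification and the rule
encoding are hypothetical.

```python
def check_stratification(layer, rules):
    """Return the rules that violate the layering constraints.

    layer: dict mapping each predicate name to its layer number L(P).
    rules: list of (premises, conclusion, justifications) triples, where
           premises and justifications are lists of predicate names;
           justifications is empty for implications and normal defaults.
    """
    violations = []
    for premises, conclusion, justifications in rules:
        l_b = layer[conclusion]
        # Assumption: L(A) of a conjunction is the maximum over its conjuncts.
        l_a = max((layer[p] for p in premises), default=0)
        ok = l_a <= l_b
        if justifications:
            l_c = max(layer[c] for c in justifications)
            ok = ok and l_c <= max(l_a, l_b)
        if not ok:
            violations.append((premises, conclusion, justifications))
    return violations


# Sub-layer numbers taken from the worked example of this section; predicates
# whose sub-layer is not stated there are left out.
layer = {"Bump": 1, "Shock": 1, "Stop": 3, "Control": 4, "Follow": 6}
rules = [
    (["Bump"], "Stop", []),              # implication
    (["Shock"], "Follow", ["Control"]),  # semi-normal default
]
violations = check_stratification(layer, rules)
```

A rule base passes the check when the returned list is empty; a rule such as
(["Follow"], "Stop", []) would be reported, since its conclusion lies in an
earlier sub-layer than its premise.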

In our system, stratification is applied at two levels. At the first level,
stratification is based on the modalities, and four layers are identified. The
first layer contains predicates with the empty modality (Holds(P, A, t)). The
second one consists of duty predicates (Must(P, A, t)). In the third one we
find capacity predicates (Able-To(P, A, t)). Finally, the last layer contains
the two predicates An (primary anomaly) and D-An (derived anomaly).
The second level of stratification concerns the predicates of the first two
layers (corresponding to the empty and duty modalities). In each of these
layers, we order the predicates so that the stratification constraints are
satisfied. We obtained 10 sub-layers in the first layer and 2 sub-layers in the
second one. We have manually checked the validity of this method on a
significant part of the corpus, and we are developing an automatic reasoning
system to validate the results obtained.

For our example, the predicates involved in the first layer are ordered
in the following 8 sub-layers (two sub-layers are not used in this example):

{Bump, Shock}, {Shock}, {Stop, Avoid, Obstacle}, {Control}, {predictable},
{Same-File, Follow}, {Stop_Cause}, {Cause-Later-Stop}.

Among others, the following inference rules of this layer are applied:

Holds(Combine(Bump, V), W, t) → ¬Holds(Stop, W, t) (sub-layer 3).

Holds(Combine(Shock, V), W, t) ∧ Holds(Combine(Shock_Pos, V), Back, t) :
Holds(Combine(Follow, V), W, t-1) [Holds(Control, W, t-1)] (sub-layer 6).

The first rule means that if W bumps into V at state t, then W is not stopped
at this state. It enables us to infer ¬Holds(Stop, A, 2) (V = B, W = A, t = 2).
The second rule states that if there is a shock between V and W at state t,
and the shock is located at the back of V, then generally W was following V in
the same lane. From this rule we can obtain
Holds(Combine(Follow, B), A, 2) (V = B, W = A, t = 2).
We deduce Holds(Combine(Follow, B), A, 1) by applying the backward
persistence rule for the predicate Follow:

Holds(Combine(Follow, V), W, t) : Holds(Combine(Follow, V), W, t-1)

An example of inferring duties in layer 2 is the rule:

Holds(Combine(Follow, V), W, t) ∧ Holds(Stop, V, t) → Must(Stop, W, t) (sub-layer 2)

If W follows V in a lane at state t, and V is stopped at this state, then W
must stop at state t too. From this rule, we infer Must(Stop, A, 1)
(V = B, W = A, t = 1).
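
The two strict rules used in this example (the Bump rule and the duty rule)
can be replayed on ground facts with a few lines of code. This is a sketch
under stated assumptions: Combine(Q, X) is encoded as the tuple (Q, X), the
fact that A follows B at state 1 is taken as already derived, and the default
rules (which need consistency checks) are not implemented. The function names
bump_rule and duty_rule are hypothetical.

```python
def bump_rule(holds):
    """Holds(Combine(Bump, V), W, t) -> not Holds(Stop, W, t).

    Returns the set of (Stop, W, t) literals inferred to be false.
    """
    return {
        ("Stop", w, t)
        for (p, w, t) in holds
        if isinstance(p, tuple) and p[0] == "Bump"
    }


def duty_rule(holds):
    """Holds(Combine(Follow, V), W, t) and Holds(Stop, V, t) -> Must(Stop, W, t)."""
    return {
        ("Stop", w, t)
        for (p, w, t) in holds
        if isinstance(p, tuple) and p[0] == "Follow" and ("Stop", p[1], t) in holds
    }


holds = {
    ("Stop", "B", 1),           # B is stopped at state 1
    (("Bump", "B"), "A", 2),    # A bumps into B at state 2
    (("Follow", "B"), "A", 1),  # A follows B at state 1 (assumed already derived)
}
not_holds = bump_rule(holds)    # negative literals: A is not stopped at state 2
must = duty_rule(holds)         # duties: A must stop at state 1
```
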
Finally, to determine whether or not A is able to avoid the shock, we use the
general rule:

Able-To(P, V, t) ↔ (∃Act) Action(Act) ∧ Pcb(P, Act) ∧ Available(Act, P, V, t)

At state t, V is able to reach P if and only if there is an action Act which is
a potential cause of P (Pcb(P, Act)) and Act is available for V to reach P at
state t. Knowing the fact Pcb(Stop, Brake) and that, by default, any potential
cause of an effect is available, we obtain Available(Brake, Stop, A, 1).
Consequently, we deduce Able-To(Stop, A, 1).
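
The capacity rule can be sketched as follows. The default "any potential cause
of an effect is available" is approximated by treating an action as available
unless it is explicitly listed as unavailable; this is a simplification of
default reasoning, and the function name able_to and the unavailable parameter
are assumptions of this example. Pairs in pcb are ordered (P, Act), as in the
general rule.

```python
def able_to(p, agent, t, actions, pcb, unavailable=frozenset()):
    """Able-To(P, V, t) <-> exists Act:
    Action(Act) and Pcb(P, Act) and Available(Act, P, V, t).

    actions: set of action names (Action(Act)).
    pcb: set of (P, Act) pairs, Act being a potential cause of P.
    unavailable: set of (Act, P, V, t) tuples blocking the availability default.
    """
    return any(
        (p, act) in pcb and (act, p, agent, t) not in unavailable
        for act in actions
    )


actions = {"Brake"}
pcb = {("Stop", "Brake")}  # braking is a potential cause of stopping

can_stop = able_to("Stop", "A", 1, actions, pcb)
cannot_stop = able_to("Stop", "A", 1, actions, pcb,
                      unavailable={("Brake", "Stop", "A", 1)})
```

In the first call nothing blocks the default, so A is able to stop at state 1;
in the second, a hypothetical exception makes braking unavailable and the
capacity is lost.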

Finally, by applying the first form of primary anomaly, we can detect the cause
of the accident:

"A did not stop in a situation in which it had to do so."
4 Conclusion and Perspectives



In the present work, we propose a norm-based reasoning system able to detect
the causes of accidents from their textual descriptions. The cause is seen
as the violation of a norm. The study we have carried out enabled us to
determine a limited number of semantic predicates (50) and inference rules
(currently 150). In the short and medium term, we will finish the
implementation of the automatic reasoning system based on the idea of
stratification, complete the design of the remaining inference rules and
validate our approach on new car crash reports. In the longer term, we will try
to generalize our methodology to other domains and explore the possibility of
applying our approach to norm-based indexing of textual documents.